Hindi to Urdu Conversion: Beyond Simple Transliteration

Authors

  • Bushra Jawaid
  • Tafseer Ahmed
Abstract

This paper presents a detailed analysis of existing work on Hindi to Urdu transliteration systems and identifies the enhancements they require. It lists the issues that are beyond the scope of character-by-character mapping. These issues include cases where several Urdu characters with the same sound correspond to a single Hindi character. Moreover, it deals with cases where the same word or words are written in two different ways. The paper lists the differences in pronunciation, spelling and writing style. It presents solutions to these issues that go beyond transliteration.
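The one-to-many problem mentioned in the abstract can be illustrated with a minimal sketch. The mapping table, the toy lexicon and the function names below are assumptions made for illustration only; they are not taken from the paper, and the lexicon lookup is just one plausible way to choose among candidates, not necessarily the authors' solution.

```python
from itertools import product

# Toy one-to-many table (hypothetical and far from complete):
# one Devanagari character can correspond to several Urdu letters
# that share the same sound.
HINDI_TO_URDU = {
    "स": ["س", "ص", "ث"],
    "ह": ["ہ", "ح"],
    "त": ["ت", "ط"],
    "ब": ["ب"],
    "ल": ["ل"],
    "ा": ["ا"],
}

# Toy Urdu word list used to pick the attested spelling (assumption).
TOY_LEXICON = {"سب", "حال"}


def candidates(hindi_word):
    """Enumerate every Urdu spelling allowed by the character table."""
    options = [HINDI_TO_URDU[ch] for ch in hindi_word]
    return ["".join(combo) for combo in product(*options)]


def convert(hindi_word, lexicon=TOY_LEXICON):
    """Return the first candidate found in the lexicon, else the first candidate."""
    cands = candidates(hindi_word)
    known = [c for c in cands if c in lexicon]
    return known[0] if known else cands[0]


if __name__ == "__main__":
    for word in ["सब", "हाल"]:
        print(word, candidates(word), convert(word))
```

Pure character mapping yields three spellings for "सब" and two for "हाल"; only a resource beyond the character table, here a word list, can select among them. A character missing from the toy table raises `KeyError`, which a real system would need to handle.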

Similar resources

Development of a Complete Urdu-Hindi Transliteration System

Hindi and Urdu are variants of the same language, but while Hindi is written in the Devnagri script from left to right, Urdu is written in a script derived from a Persian modification of Arabic script written from right to left. The difference in the two scripts has created a script wedge as majority of Urdu speaking people in Pakistan cannot read Devnagri, and similarly the majority of Hindi s...

Urdu Hindi Machine Transliteration using SMT

Transliteration is a process of transcribing a word of the source language into the target language such that when the native speaker of the target language pronounces it, it sounds as the native pronunciation of the source word. Statistical techniques have brought significant advances and have made real progress in various fields of Natural Language Processing (NLP). In this paper, we have ana...

Improving Machine Translation via Triangulation and Transliteration

In this paper we improve Urdu→Hindi English machine translation through triangulation and transliteration. First we built an Urdu→Hindi SMT system by inducing triangulated and transliterated phrase-tables from Urdu–English and Hindi–English phrase translation models. We then use it to translate the Urdu part of the Urdu-English parallel data into Hindi, thus creating an artificial Hindi-English...

Hindi-to-Urdu Machine Translation through Transliteration

We present a novel approach to integrate transliteration into Hindi-to-Urdu statistical machine translation. We propose two probabilistic models, based on conditional and joint probability formulations, that are novel solutions to the problem. Our models consider both transliteration and translation when translating a particular Hindi word given the context whereas in previous work transliterat...

A Hybrid Model for Urdu Hindi Transliteration

We report in this paper a novel hybrid approach for Urdu to Hindi transliteration that combines finite-state machine (FSM) based techniques with statistical word language model based approach. The output from the FSM is filtered with the word language model to produce the correct Hindi output. The main problem handled is the case of omission of diacritical marks from the input Urdu text. Our sy...

Publication date: 2008